ATTORNEY/AGENT INFORMATION:
(A) NAME: ?e:;^:: y/ Joanne ?. . 
(B) REGISTRATION NUMBER: 42,935
(C) REFERENCE/DOCKET NUMBER: 4 o 0 0 - CO -3 3 . 2 4

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (650) 324-0380
(B) TELEFAX: (650) 324-0960

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1295 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 1.33 kb EcoRI insert of ET1.1,
forward sequence

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1293

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..1294

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..1295

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AGACCTGTCC CTGTTGCAGC TGTTCTACCA CCCTGCCCCG AGCTCGAACA GGGCCTTCTC 60 

TACCTGCCCC AGGAGCTCAC CACCTGTGAT AGTGTCGTAA CATTTGAATT AACAGACATT 120

GTGCACTGCC GCATGGCCGC CCCGAGCCAG CGCAAGGCCG TGCTGTCCAC ACTCGTGGGC 180 

CGCTACGGCG GTCGCACAAA GCTCTACAAT GCTTCCCACT CTGATGTTCG CGACTCTCTC 240

GCCCGTTTTA TCOO 2GC OAT TGG2 2C2 OTA CAGGTIACAA OTTGTGAATT GTACGAGCTA 300

GT3GAGGCCA TGGTCGAGAA GGGCOAGGAT GGCTC2GCZG TCCTTGAGCT TGATCTTTGC 360 

AACCGTGACG TGTCOAGGAT CAC CTTCTTC CAGAAAGAIT GTAACAAGTT CACOAOAGGT 420 

GAGACCATTG CCCATGGTAA AGTGGGCCAG GGCATCTCGG CCTGGAGCAA GACCTTCTGC 480

GCCCTCTTTG GCCCTTGGTT CCGCGCTATT GAGAAGGCTA TTCTGGCCCT GCTCCCTCAG 540

GGTGTGTTTT AOGGTGATGO CTTT3AT 3 AC AOCGTOTTCT CGGCGGCTG7 GGCCGGAGCA 600

AAGGCATCCA TGGCGCTHA GAATGA2TTT TCTGAGTTTG ACTCCACCCA GAATAACTTT 660

TCTCTGGG7C 7AGAGTGTGC TATIA7 3 GAG GAGTGTGGGA TGGCGGAGTG GCTCATCCGC 720

CTGTATCACC TTATAAGGTG TGCGTGGATC TTGCAGGCCC CGAAGGAGTC TCTGCGAGGG 780

TTTTGGAAGA AACACTCCGG TGAGCCCGGC ACTCTTCTAT GGAATACTGT CTGGAATATG 840

GCCGTTATTA CCCACTGTTA TGACTTCCGC GATTTTCAGG TGGCTGCCTT TAAAGGTGAT 900

GATTCGATAG TGCTTTGCAG TGAGTATCGT CAGAGTCCAG GAGCTGCTGT CCTGATCGCG 960

GGCTGTGGCT TGAAGTTGAA GGTAGATTTG CGCCCGATCG GTTTGTATGC AGGTGTTGTG 1020

GTGGCCCCCG GCCTTGGCGC GCTCGOTGAT GTTGTGCGCT TCGCOGGCCG GCTTACCGAG 1080

AAGAATTGGG GCGCTGGCCC TGAGGGGGCG GAGCAGCTCC GCGTCGCTGT TAGTGATTTC 1140

CTCCGCAAGC TCACGAATGT AGCTCAGATG TGTGTGGATG TTGTTTCCCG TGTTTATGGG 1200

GTTTCCCCTG GACTCGTTCA TAACCTGATT GGCATGCTAC AGGCTGTTGC TGATGGCAAG 1260 

GCACATTTCA CTGAGTCAGT AAAACCAGTG CTCGA 1295

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 431 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Arg Pro Val Pro Val Ala Ala Val Leu Pro Pro Cys Pro Glu Leu Glu 
1 5 10 15 

Gln Gly Leu Leu Tyr Leu Pro Gln Glu Leu Thr Thr Cys Asp Ser Val
20 25 30

Val Thr Phe Glu Leu Thr Asp Ile Val His Cys Arg Met Ala Ala Pro
35 40 45

Ser Gln Arg Lys Ala Val Leu Ser Thr Leu Val Gly Arg Tyr Gly Gly
50 55 60

Arg Thr Lys Leu Tyr Asn Ala Ser His Ser Asp Val Arg Asp Ser Leu
65 70 75 80

Ala Arg Phe Ile Pro Ala Ile Gly Pro Val Gln Val Thr Thr Cys Glu
85 90 95

Leu Tyr Glu Leu Val Glu Ala Met Val Glu Lys Gly Gln Asp Gly Ser
100 105 110

Ala Val Leu Glu Leu Asp Leu Cys Asn Arg Asp Val Ser Arg Ile Thr
115 120 125

Phe Phe Gln Lys Asp Cys Asn Lys Phe Thr Thr Gly Glu Thr Ile Ala
130 135 140

His Gly Lys Val Gly Gln Gly Ile Ser Ala Trp Ser Lys Thr Phe Cys
145 150 155 160

Ala Leu Phe Gly Pro Trp Phe Arg Ala Ile Glu Lys Ala Ile Leu Ala
165 170 175

Leu Leu Pro Gln Gly Val Phe Tyr Gly Asp Ala Phe Asp Asp Thr Val
180 185 190

Phe Ser Ala Ala Val Ala Ala Ala Lys Ala Ser Met Val Phe Glu Asn
195 200 205

Asp Phe Ser Glu Phe Asp Ser Thr Gln Asn Asn Phe Ser Leu Gly Leu
210 215 220

Glu Cys Ala Ile Met Glu Glu Cys Gly Met Pro Gln Trp Leu Ile Arg
225 230 235 240

Leu Tyr His Leu Ile Arg Ser Ala Trp Ile Leu Gln Ala Pro Lys Glu
245 250 255

Ser Leu Arg Gly Phe Trp Lys Lys His Ser Gly Glu Pro Gly Thr Leu
260 265 270

Leu Trp Asn Thr Val Trp Asn Met Ala Val Ile Thr His Cys Tyr Asp
275 280 285

Phe Arg Asp Phe Gln Val Ala Ala Phe Lys Gly Asp Asp Ser Ile Val
290 295 300

Leu Cys Ser Glu Tyr Arg Gln Ser Pro Gly Ala Ala Val Leu Ile Ala
305 310 315 320

Gly Cys Gly Leu Lys Leu Lys Val Asp Phe Arg Pro Ile Gly Leu Tyr
325 330 335

Ala Gly Val Val Val Ala Pro Gly Leu Gly Ala Leu Pro Asp Val Val
340 345 350

Arg Phe Ala Gly Arg Leu Thr Glu Lys Asn Trp Gly Pro Gly Pro Glu
355 360 365

Arg Ala Glu Gln Leu Arg Leu Ala Val Ser Asp Phe Leu Arg Lys Leu
370 375 380

Thr Asn Val Ala Gln Met Cys Val Asp Val Val Ser Arg Val Tyr Gly
385 390 395 400

Val Ser Pro Gly Leu Val His Asn Leu Ile Gly Met Leu Gln Ala Val
405 410 415

Ala Asp Gly Lys Ala His Phe Thr Glu Ser Val Lys Pro Val Leu
420 425 430

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 



(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: linker - top (5') sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GGAATTCGCG GCCGCTCG 18

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: linker - bottom (3') sequence

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CGAGCGGCCG CGAATTCCTT 20

(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1295 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 1.33 kb EcoRI insert of ET1.1, 
reverse sequence 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
TCGAGCACTG GTTTTACTGA CTCAGTGAAA TGTGCCTTGC CATCAGCAAC AGCCTGTAGC 60
ATGCCAATCA GGTTATGAAC GAGTCCAGGG GAAACCCCAT AAACACGGGA AACAACATCC 120
-A'-^- ^ '-jr-.^- . -.^A* T CoTor-.GCTTG CGGAGGAAAT CACTAACA.GC GAGGGGGAGG 180

aoaacatgag G3A3CGC?3C AAGGCCGGGG 3CCACCACAA CACC7GCA7A CAAACCGATC 300 

GGGGGGAAAT CTAXTTCAA CTTGAAGCGA CAGCCGGCGA TGAGGAGAGG AGCTCCTGGA 360 

CTC7GAC3AT A:T:AGTGGA AAGCAC7A7C GAA7CATCAG C777AAAGGC AGCCACCTGA 420

AAATCGC3GA AG7 OATAACA GTGGGTAATA ACGGCCATAT 7 GC AG AG AG T A77GCATAGA 480

AGAGTGCOGG GCTCACCGGA GTG777G77G CAAAACCCTC GCAGAGACTC C77GGGGGGG 540 

TGCAAGAICC ACGGAGAC37 7A7AAGG7GA 7AGAGGGGGA 7GAGGCAG7G GGGGA7GGGA 600

CAC7CC73CA 7AA7AGCACA C737AGACCC AGAGAAAAGT 7A773TGGGT GGAGTCAAAG 660 

TC AG AAAAG T 2AT737CAAA CACCATGGAT GCCTTTGCT3 CGGCCACAGC 0GCCGA3AA3 720 

AC3GTGT3AT 3AAAGGCATC ACGGTAAAAC ACAC3CTGAG GGAGGAGGGC CAGAATAGCC 780

TTOTCAATAG C3C3 3AACCA AGG 3CCAAAG A3GGGGGAGA AGG7GT7GC7 CCAGGC OGAG 840

A7GG 0G7 jGC C3ACTTTACC A7 3GGCAA7G G7G7GAGC7G TGG7GAAC77 GTTACAATCT 900 

TTCTGGAAGA A.3GTGATCCT GGAGACGTCA CGGTTGCAAA GATCAAGC7C AAGGAC 3G0G 960 

GAGCCATCGT GGCCCTTC7C GACCATGGCC TCCAGTAGCT CGTACAA7TC ACAAGTTGTA 1020 

ACCTGTACGG GGCCAATGGC CGGGATAAAA CGGGGGAGAG AGTCGCGAAC ATCAGAG7GG 1080 

GAAGCATTGT AGAGG777GT GCGACCGCCG 7A3CGGCCCA CGAGTGTGGA CAGCACGGGC 1140 

77GGGC7GGG 7GGGGGCGGC CA7GCGGCAG 7GGAGAATGT CTGTTAA7TC AAATGTTACG 1200

ACACTA7CAC AGG7GGTGAG C7CCTGGGGC AGGTAGAGAA GGCCCTGTTC GAGC7CGGGG 1260

CAGGGTGGTA GAACAGCTGC AACAGGGACA GGTCT 1295

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7195 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: HEV - Burma strain

(B) LOCATION: 28..5106
(B) LOCATION: 5106..5474

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AGGCAGACCA CATATGTGGT CGATGC2A7G 3AGGCCCA7C AGT7TA7TAA GGCTCCTGGC 60 

atca:tactg ctattgag:a ggctg:t:ta ocagcggcca actctgc::t ggggaatgct 120 

37gg7ag77a "g 0c7t7 737 37373a3gag :a3a77gaga 7cc7ca77aa 3gtaa73caa 1-30 

"7"c:agc ptgitttgcg :::::3aggtt T73tg:aa73 atcc:a7c:a gcgtgtgatc 243 

0a7aac3agc 7ggag 37t7a c7 3cc 3cgcc c3c7cc3gc3 gctgt 37tga aattggggcc 300 

CATCCCCGCT CAATAAATGA TAATGG7AAT GTG3TCCACC G 3TGCTTCC7 CCGCCCTGTT 360 

G jGCGTGATG 77CAG03C7G G7A7AC7GCT CG ZACTCGCG G jGG 3GC7 3C 7AA77GCCGG 420 

C jTTCCGCGC 7GCGC3GGCT TC:C3 37GC7 GACCGCACTT A3TGCC7CGA C3GG7TTTC7 4 SO 

GGC7G7AACT 77CCCGCCGA GAG 7 3GCATC GCCCTC7ACT CCCTTCATGA TATG7CACCA 54 0 

TCTGATGTCG C7GAGGCCAT GTTCCGCCAT GG7ATGACGC GGCTC7ATGC CGCCCTCCAT 600 

C77CCGCCTG AGGTCCTGCT GCCCCCTGGC ACA7A7CGCA CCGCATCG7A TTTGCTAAT7 6 60 

CATGACGGTA GGCGCGTTGT GG7GACGTAT GAGGGTGATA C7AGTGCTGG TTACAACCAC 7 20 

GA7G7CTCCA ACTTGCGCTC CTGGATTAGA ACCACCAAGG TTACCGGAGA CCATCCCCTC 73 0 

GTTATCGAGC GGGTTAGGGC CA77GGCTGC CACTTTGTTC TCTTGCTCAC GGCAGCCCCG 84 0 

GAGCCATCAC C7ATGCCTTA TGT7CCTTAC CCCCGG7CTA CCGAGGTCTA TGTCCGATCG 900 

ATCTTCGGCC CGGG7GGCAC CCCTTCCTTA TTCCCAACCT CATGCTCCAC TAAGTCGACC 960 

TTCCATGCTG TCCC7GCCCA TAT7TGGGAC CGTCTTATGC TGTTCGGGGC CACCTTGGAT 102 0 

GACCAAGCCT TTTGCTGCTC CCG7T7AATG ACCTACCTTC GCGGCA7TAG CTACAAGGTC 108 0 

ACTGTTGGTA CCCTTGTGGC TAATGAAGGC TGGAATGCCT CTGAGGACGC CGTCACAGCT 114 0 

GTTA7CACTG CG3CCTACC7 7ACCA77TGC CAC3AG3GG7 ATG7C33CAC C3AGGCTA7A 12 JO 

TGCAA3GGGA TGCGTCGTCT GGAACGGGAG CATGC 2 TAG A AG777ATAAC A3GCC72TAC 1260 

AGC7GGCTC7 TCGAGAAGTC CGGCC3TGAT TACATCOCTG GCC3T3A377 GGAGTTCTAC 1320 

G3 33AG7GCA GGCGCTGGCT CTCCGCCGGC TTTCATCTTG ATC 3ACGGGT G77GG7T77T 1330 

■oA3 3A 37CGG CZCC 37 G 3 2A T7G7AGGACC G3GATCCG7A AGGCGGTCTC AAAG777T3C 144.) 

T 3G77CA7GA AG7GGCT7G3 7 2AGGAG7GC ACC7GC77CC TT 2AGCGTGC AGAAGGCGCC 150) 

GZCGGZGACG AGGG7CA7GA TAA73AAGCC 7A7GAGGGG7 CCGA7G77GA CCG7 3C7GAG 1560
2 . 4 0




TCATTGGC :T 


-TT : . 3CC 2CG 


TTCCCG 33CG 


■j ■ j ■■ _■ l G T T T G 
2: :-o


aatocattct 
. p , ^ ^ ,^ „


TGAT 30 3 3TC 


2 1 0 


tcta3t::ag 


COCGGCCTGA 


.^rp rp^.-^ ~ rp rp rp rri 


ATGT :~ 


CTTCTATACG 


TAGTAGGGCG 


2220 



GCC.A Z 3CCTA 0 0 GT GG GG 

tct:;goccgg ogcttgct 



jC oggtotaccc coocgtgoac cggacocttc ccc:g3Tcgc 2230 

jA G C 3 3GCTTCT GGGGGTA Z ZG CGGGGGCGCC GGC OATAACT 



CACCAGAC3G CCCGGCACGG CC GCCTGCTC TTCACCTACC CGGATGGCTC TAAGGTATTC 



2340 



24 00 

GCGGGCTCGC TGTTCGAGTC GACATGCACG TGGCTCGTTA ACGCGTCTAA TGTTGACCAC 24 60 

CGCCCTGGCG GCGGGCTTTG C CAT GOAT TT TA.CCAAAGGT ACCCCGCCTC CTTTGATGCT 2520 

GCCTCTTTTG TGATGCGCGA CGGCGCGGCC GCCTACACAC TAACCCCCCG GCCAATAATT 2530 

3ACGCTGTCG CCCGTGATTA TAGGTTGGAA CATAACCCAA AGAGGCTTGA GGCTGCTTAT 2640 

CGGGAAACTT GCTCCCGCCT CGGCACCGCT GCATACCCGC TCCTCGGGAC CGGCATATAC 27 00 

CAGGTGCCGA TCGGCCCCAG TTTTGACGCC TGGGAGCGGA ACCACCGCCC CGGGGATGAG 27 60 

TTGTACCTTC CTGAGCTTGC TGCCAGATGG TTTGAGGCCA ATAGGCC3AC CCGCCCGACT 2 820 

CTCACTATAA CTGAGGATGT TGCACGGACA GCGAATCTGG CCATCGAGCT TGACTCAGCC 2380 

ACAGATGTCG GCCGGGCCTG TGCCGGCTGT CGGGTGAGCC CCGGCGTTG? TCAGTACCAG 2 94 0 

TTTACTGCAG GTGTGC 2TGG ATCCGGCAAG TCCCGCTGTA TCACCCAAGC CGAT 3TGGAC 3 0 00 

GTTGTCGT3G T000GACGC3 T3A3TTGC3T AATG 3 3T 3GG GG3GT3GCG3 CTTTGCTGGT 3060 

TTTAC ZCGGG ATACTGCC 3C GAGAGTCAC3 CAGGGGGGZZ GGGTGGTCAT TGATGAGGGT 3 12) 

CCATGCCT3C CC 3GTCAGCT GCTGCTGCTC CACATGCAGC GGGCCGCCAC G3TCCACCTT 3130 

CTTGG 2GACC CGAACCAGAP C33AG3GAT3 GAGTTTGAGG AGGCTGGGGT G3TG3G3333 324 J 

ATGAG 3CGG 3 A0TTAGGCC0 CACCTCOTGG TGGCATGTTA CC 3ATCGGTG GCCTGC 3GAT 3 30.1 

GTATG 3GAGC TCATCCGTGG TGCATACCCO AT G AT CC AG A CCACTAGC3G GGTTCTCCGT 3360 

i --jTTGTTCT GGZGTGAGZG TGCCGTCGGG CAGAAACTAG TGTTCACCCA GGCGGGCPiAG 342 j 



GTT3C7CT3A 7GGGC 2 AC AC 7GAGAAG7 3C GTCATCATTG ACGCAC 3AGG :CT3CTTCGC 3600 
3AGGT 3GGCA 7C7CC3A7GC AAT3G77AA7 AA277TT7CC 7C3C73G73G G3AAATTG 3T 3660 
CACCAGCGGC CA7CA377AT TCCCCGTG 3C AA3CC73ACG CGAAT3TT3A CACCCT3G IT 3720 

GC3TTCC3GC 7GTCTTGCCA GATTAGTGCG TT3CA7CAGT TGGCTGAGGA GGTTGGCCAC 3730 

agacctg::c :tgtt3-:agc ?gtt:ta"a gc:tg:g:;g agctg3aa:a gsgc gttgtc 3340 

TA-.',T,^C::C AG3AG-3 7 3AG CAC3T37GAT AG737 3 3TAA CA7T7 3AA7T AACA 3ACA 37 3?J0 

gtggactjgc gcatgg::gc czcsagcgag 33caa33C3g t3C7 3?c:ac a:tt:tg",c 3?60 

gg37acggcg gt:g:a:aaa ggt'jtagaat 3ctt::ca:t 3 , :3a , :gtt-:g tgaitctztc 4020 

GCZCGTTTTA 7C3 3 33 3 3AT TGG3CC3G7A 3AGGT7ACAA C7TGT3AA7T G7ACGAGC7A 4 030 

GTGGAGGCCA TGGTGGAGAA GGG-3CAGGAT GG3TC33C'3G TCCTTGAGGT TGATOTTTGC 414 0 

AAGCGTGACG 7 37 3CAGGA7 CAC 377C77C CAGAAAGATT G7AA-3AAGT7 CACGACAGGT 4 200 

GAGACCAT7G CCCA7GGTAA AGTGGGCCAG GGCATCTCGG CC7G3AGCAA GAC 37TCTGC 4 260 

GCCCTCTTTG GCCCTTGGTT CCGCGC7ATT GAGAAGGCTA 7TCT3GCCCT GCTCCCTCAG 4 320 

GG7GTGT7T7 ACGGTGATGC CT77GATGAC ACCGTCTTCT C3GC3GCTGT GGCCGCAGCA 4 380 

AAGGCATCCA TGG7G77TGA GAATGACTTT TCTGAGT7TG A37CCACCCA GAATAACTTT 4 4 40 

TCTCTGGGTC TAGAGTGTGC TATTATGGAG GAGTGTGGGA TGCCGCAGTG GC7CATCCGC 4 500 

CTGTATCACC TTATAAGGTC TGCGTGGATC TTGCAGGCCC CGAAGGAGTC 7C7GCGAGGG 4 560 

TTTTGGAAGA AACACTCCGG TGAGCCCGGC ACTCTTCTAT GGAATACTGT CT3GAATATG 4 620 

GCCGTTATTA CCCACTGTTA TGACTTCCGC GATTTTCAGG TGGCTGCCTT TAAAGGTGAT 4 680 

GATTCGATAG TGCTTTGCAG TGAGTATCGT CAGAGTCCAG GAGCTGCTGT CCTGAT2GCC 4 740 

GGCTGTGGCT TGAAGTTGAA GGTAGA7TT3 CGCCC 3ATC3 GTTTGTATGC AGGTGTTGTG 4 300 

GTGGCCCCCG GCCTTGGCGC GGTCCC7GAT GTTGTGCGCT TC3CCGGCC3 GCTTACCGAG 4 360 

AA3AATTGGG G3CGTGGCC3 TGAGG3GGC3 GAGGA3CTG 3 GC3TC3CT3T TAGT3ATTT 3 4 32 3 

CT3 3G2AAGG T 3A3GAAT 3T AGCT7AGAT 3 T3TGTG3AT3 7737773CG3 T3TT7A7GGG 4933 

GTTTCCCCTG GACTCGTTCA TAAC37GA7T G33ATGCTAC AGGC7 377GC TGATGGCAAG 504 0 

GCA3A777 3 A C73AGT3A3T AAAA-23A3T3 CTC3A3TTGA GAAA7TCAAT CTTGT3TG3G 5100 

GTGGAATGAA 7AACATG737 T7T3C7GC3C 3 3A7GG37r3 G7GACGA7 3C GCC3TCGGG3 5160 

TATTTTGTTG CTGCTCCTCA TGTTTTTGCC TAT3C73C33 G3G3CACC3C C3GG7CAGC3 5220 

~vj7CjTGGGC GGCGCAGCGG 3GGT7CGGGC G37GG7T73T GGGGTGACCG 528 'J 



CGCTTGGCGT GACCAGG 2CC AGCG OCCC 3C 3GTTGCCTCA C 3TCGTAGAC CTACCA3AGC 5460 

T3G3GCC3CG CC 3CTAAC CG CG3T0GCTCC GGCCCAT3AC A3CCCGC3AG TGCCTGATGT 5520 

G ^AOrGGGGC GG3GCCATCT TG3GTCGGCA GTATAAC OTA TCAACATCTC 2CCTTAC OTG 5 5 30 

TTCCGTGGCC AC OGGCACTA AGGTGGTTGT TTATGCCGCC CCTCTTAGTC C3CTTTTA0C 5640 
C3TTCAGGAC GGGAC OAATA 3CTACATAAT GGCGAC3 3AA GCTTCTAATT AT 3CCCA3TA 



-oTTGGC C3C3C3ACAA. TC7 3TTA3 3G "C7 3CT3 3TC 



j j^'ju r TA o 'GO 



A ^ ^ ^ m 4 - x • i - - ^ j'o 1 ^ -A_^j^A^_ w'.GA^^TGCG TTjATA. m GAA 5B'2 0 

TTGAATAAGC TGGACGGATG TTC 3TATTTT A3T OGAGG 0 3 3GCATAGCCT CTGAGCTTGT 5.30 
GAT OOG.AAGT GAGCGT OTAC AC7ATC3 FAA CCAAGGTT 3G GGGTCOGTCG AGACOTCTGG 



5 94 0 



G GCCCACT3 3T 6000 



bGT3 3TTGAG GAG GAG GOT A C3TCTGGT7T TGTTAT 3CTT TGCATACATc 

AAATTCTTAT ACTAATACAC CCTATACCGG TGCCCTCGGG CTGTTGGACT TTGCCCT FGA 6060 

GCTTGAGTTT CGCAACCTTA CCCCCGGTAA CACCAATACG CGGGTCTCCC GTTATTC OA 3 6120 

CACTGCTCGC CACCGCCTTC GTCGCGGTGC GGACGGGACT GCCGAGCTCA CCACCACGGC 613 0 

TGCTACCCGC TTTATGAAGG ACCTCTATTT T AC TAG TACT AATGGTGTCG GTGAGATCGG 624 0 

CCGCGGGATA GCCCTCACCC TGTTCAACCT TGCTGACACT CTGCTTGGCG GCCTGCCGAC 6300 

AGAATTGATT TCGTCGGCTG GTGGCCAGCT GTTCTACTCC CGTCCCGTTG TCTCAGCCAA 6 3 60 

TGGCGAGCCG ACTGTTAAGT TGTATACATC TGTAGAGAAT GCTCAGCAGG ATAAGGGTAT 64 20 

TGCAATCCCG CATGACATTG ACCTCGGAGA ATCTCGTGTG GTTATTCAGG ATTATGATAA 64 80 

CCAACATGAA CAAGATCGGC CGACGCCTTC TCC AGCCCCA TCGCGCCCTT TCTCTGTCCT 654 0 

TCGAGCTAAT GATGTGCTTT GGCTCTCTCT CACCGCTGCC GAGTATGACC AGTCCACTTA 6 600 

TGGCTCTTCG ACTGGCCCAG TTTATGTTTC TGACTCTGTG ACCTTGGTTA ATGTTGCGAC 6660 

CG3CGC3CAG GCCGTTGCCC GGT 3GCTCGA TTGGACCAAG GTCACACTTG ACGGTCGCC 2 6720 

CCT2TC2ACC AT 3 3 AG G AG T ACTG 3AAGAG 3TTGTTT 3T2 CT3CCGCTCC GGG3TAAGGT 6780

CTTTTT2TGG GAGGCAGGCA CAACTAAA3G C3GGTACC3T TATAATTATA ACA33ACTGG 6840

TAGCGACCAA CTGCTTGTCG AG.AATGCC3C CZG'ZC^ZZ 3Z GTCGCTATTT C 2 ACT TAG AC 6900

CA3TAGC2T 3 GGTGCT 3GTC C3GT3TCCAT TTTTGCGGTT GCC 3TTTTAG 2CCZZZACTZ 6960

TGC3CTA3CA TTGCTTGA3G ATACCTT 3GA CTA3:CTGC3 CGC 3C33ATA CTTTT3AT3A 7020

TTTCTGC2CA GAGTGCC3CC CCCTTGGCCT TCAGGGCTGC GCTTTC 3AGT CTACTGTC 3C 7080

TGA 3CTTCAG CGCCTTAAGA TGAAGGT3GG TAAAACTCGG GAGTTGTAGT TTATTTGCTT 7140



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1693 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:



Met Glu Ala His Gln Phe Ile Lys Ala Pro Gly Ile Thr Thr Ala Ile
1 5 10 15

Glu Gln Ala Ala Leu Ala Ala Ala Asn Ser Ala Leu Ala Asn Ala Val
20 25 30

Val Val Arg Pro Phe Leu Ser His Gln Gln Ile Glu Ile Leu Ile Asn
35 40 45

Leu Met Gln Pro Arg Gln Leu Val Phe Arg Pro Glu Val Phe Trp Asn
50 55 60

His Pro Ile Gln Arg Val Ile His Asn Glu Leu Glu Leu Tyr Cys Arg
65 70 75 80

Ala Arg Ser Gly Arg Cys Leu Glu Ile Gly Ala His Pro Arg Ser Ile
85 90 95

Asn Asp Asn Pro Asn Val Val His Arg Cys Phe Leu Arg Pro Val Gly
100 105 110

Arg Asp Val Gln Arg Trp Tyr Thr Ala Pro Thr Arg Gly Pro Ala Ala
115 120 125

Asn Cys Arg Arg Ser Ala Leu Arg Gly Leu Pro Ala Ala Asp Arg Thr
130 135 140

Tyr Cys Leu Asp Gly Phe Ser Gly Cys Asn Phe Pro Ala Glu Thr Gly
145 150 155 160

Ile Ala Leu Tyr Ser Leu His Asp Met Ser Pro Ser Asp Val Ala Glu
165 170 175

Ala Met Phe Arg His Gly Met Thr Arg Leu Tyr Ala Ala Leu His Leu
180 185 190

Pro Pro Glu Val Leu Leu Pro Pro Gly Thr Tyr Arg Thr Ala Ser Tyr
195 200 205

Leu Leu Ile His Asp Gly Arg Arg Val Val Val Thr Tyr Glu Gly Asp
210 215 220

Thr Ser Ala Gly Tyr Asn His Asp Val Ser Asn Leu Arg Ser Trp Ile
225 230 235 240

Arg Thr Thr Lys Val Thr Gly Asp His Pro Leu Val Ile Glu Arg Val
245 250 255

Arg Ala Ile Gly Cys His Phe Val Leu Leu Leu Thr Ala Ala Pro Glu
260 265 270



Pro Ser Pro Met Pro Tyr Val Pro Tyr Pro Arg Ser Thr Glu Val Tyr
275 280 285

Val Arg Ser Ile Phe Gly Pro Gly Gly Thr Pro Ser Leu Phe Pro Thr
290 295 300

Ser Cys Ser Thr Lys Ser Thr Phe His Ala Val Pro Ala His Ile Trp
305 310 315 320

Asp Arg Leu Met Leu Phe Gly Ala Thr Leu Asp Asp Gln Ala Phe Cys
325 330 335

Cys Ser Arg Leu Met Thr Tyr Leu Arg Gly Ile Ser Tyr Lys Val Thr
340 345 350

Val Gly Thr Leu Val Ala Asn Glu Gly Trp Asn Ala Ser Glu Asp Ala
355 360 365

Leu Thr Ala Val Ile Thr Ala Ala Tyr Leu Thr Ile Cys His Gln Arg
370 375 380

Tyr Leu Arg Thr Gln Ala Ile Ser Lys Gly Met Arg Arg Leu Glu Arg
385 390 395 400

Glu His Ala Gln Lys Phe Ile Thr Arg Leu Tyr Ser Trp Leu Phe Glu
405 410 415

Lys Ser Gly Arg Asp Tyr Ile Pro Gly Arg Gln Leu Glu Phe Tyr Ala
420 425 430

Gln Cys Arg Arg Trp Leu Ser Ala Gly Phe His Leu Asp Pro Arg Val
435 440 445

Leu Val Phe Asp Glu Ser Ala Pro Cys His Cys Arg Thr Ala Ile Arg
450 455 460

Lys Ala Leu Ser Lys Phe Cys Cys Phe Met Lys Trp Leu Gly Gln Glu
465 470 475 480

Cys Thr Cys Phe Leu Gln Pro Ala Glu Gly Ala Val Gly Asp Gln Gly
485 490 495

His Asp Asn Glu Ala Tyr Glu Gly Ser Asp Val Asp Pro Ala Glu Ser
500 505 510

Ala Ile Ser Asp Ile Ser Gly Ser Tyr Val Val Pro Gly Thr Ala Leu
515 520 525

Gln Pro Leu Tyr Gln Ala Leu Asp Leu Pro Ala Glu Ile Val Ala Arg
530 535 540

Ala Gly Arg Leu Thr Ala Thr Val Lys Val Ser Gln Val Asp Gly Arg
545 550 555 560

Ile Asp Cys Glu Thr Leu Leu Gly Asn Lys Thr Phe Arg Thr Ser Phe
565 570 575

Val Asp Gly Ala Val Leu -J 1m Thr Asn Gly Pro Glu Arg His Asn Leu
580 585 590

Ser ?r '~ Asp Ala Ser 3 — Thr Met Ala Ala Gly Pro Phe Ser Leu
595 600 605

Thr Tyr Ala Ala Ser Ala Ala Gly Leu Glu Val Arg Tyr Val Ala Ala
610 615 620

Leu As ? His Arg Ala Val Phe Ala Pro Gly Val Ser Pro Arg Ser
625 630 635 640

Ala Pro Gly Glu Val Thr Ala Phe Cys Ser Ala Leu Tyr Arg Phe Asn
645 650 655

Arg Glu Ala Gln Arg His Ser Leu Ile Gly Asn Leu Trp Phe His Pro
660 665 670

Glu Gly Leu Ile Gly Leu Phe Ala Pro Phe Ser Pro Gly His Val Trp
675 680 685

Glu Ser Ala Asn Pro Phe Cys Gly Glu Ser Thr Leu Tyr Thr Arg Thr
690 695 700

Trp Ser Glu Val Asp Ala Val Ser Ser Pro Ala Arg Pro Asp Leu Gly
705 710 715 720

Phe Met Ser Glu Pro Ser Ile Pro Ser Arg Ala Ala Thr Pro Thr Leu
725 730 735

Ala Ala Pro Leu Pro Pro Pro Ala Pro Asp Pro Ser Pro Pro Pro Ser
740 745 750

Ala Pro Ala Leu Ala Glu Pro Ala Ser Gly Ala Thr Ala Gly Ala Pro
755 760 765

Ala Ile Thr His Gln Thr Ala Arg His Arg Arg Leu Leu Phe Thr Tyr
770 775 780

Pro Asp Gly Ser Lys Val Phe Ala Gly Ser Leu Phe Glu Ser Thr Cys
785 790 795 800

Thr Trp Leu Val Asn Ala Ser Asn Val Asp His Arg Pro Gly Gly Gly
805 810 815

Leu Cys His Ala Phe Tyr Gln Arg Tyr Pro Ala Ser Phe Asp Ala Ala
820 825 830

Ser Phe Val Met Arg Asp Gly Ala Ala Ala Tyr Thr Leu Thr Pro Arg
835 840 845

Pro Ile Ile His Ala Val Ala Pro Asp Tyr Arg Leu Glu His Asn Pro
850 855 860

Lys Arg Leu Glu Ala Ala Tyr Arg Glu Thr Cys Ser Arg Leu Gly Thr
865 870 875 880

Ala Ala Tyr Pro Leu Leu Gly Thr Gly Ile Tyr Gln Val Pro Ile Gly
885 890 895

Pro Ser Phe Asp Ala Trp Glu Arg Asn His Arg Pro Gly Asp Glu Leu
900 905 910



Tyr Leu Pro Glu Leu Ala Ala Arg Trp Phe Glu Ala Asn Arg Pro Thr
915 920 925

Arg Pro Thr Leu Thr Ile Thr Glu Asp Val Ala Arg Thr Ala Asn Leu
930 935 940

Ala Ile Glu Leu Asp Ser Ala Thr Asp Val Gly Arg Ala Cys Ala Gly
945 950 955 960

Cys Arg Val Thr Pro Gly Val Val Gln Tyr Gln Phe Thr Ala Gly Val
965 970 975

Pro Gly Ser Gly Lys Ser Arg Ser Ile Thr Gln Ala Asp Val Asp Val
980 985 990

Val Val Val Pro Thr Arg Glu Leu Arg Asn Ala Trp Arg Arg Arg Gly
995 1000 1005

Phe Ala Ala Phe Thr Pro His Thr Ala Ala Arg Val Thr Gln Gly Arg
1010 1015 1020

Arg Val Val Ile Asp Glu Ala Pro Ser Leu Pro Pro His Leu Leu Leu
1025 1030 1035 1040

Leu His Met Gln Arg Ala Ala Thr Val His Leu Leu Gly Asp Pro Asn
1045 1050 1055

Gln Ile Pro Ala Ile Asp Phe Glu His Ala Gly Leu Val Pro Ala Ile
1060 1065 1070

Arg Pro Asp Leu Gly Pro Thr Ser Trp Trp His Val Thr His Arg Trp
1075 1080 1085

Pro Ala Asp Val Cys Glu Leu Ile Arg Gly Ala Tyr Pro Met Ile Gln
1090 1095 1100

Thr Thr Ser Arg Val Leu Arg Ser Leu Phe Trp Gly Glu Pro Ala Val 
1105 1110 1115 1120

Gly Gln Lys Leu Val Phe Thr Gln Ala Ala Lys Pro Ala Asn Pro Gly
1125 1130 1135

Ser Val Thr Val His Glu Ala Gln Gly Ala Thr Tyr Thr Glu Thr Thr
1140 1145 1150

Ile Ile Ala Thr Ala Asp Ala Arg Gly Leu Ile Gln Ser Ser Arg Ala
1155 1160 1165

His Ala Ile Val Ala Leu Thr Arg His Thr Glu Lys Cys Val Ile Ile
1170 1175 1180

Asp Ala Pro Gly Leu Leu Arg Glu Val Gly Ile Ser Asp Ala Ile Val
1185 1190 1195 1200

Asn Asn Phe Phe Leu Ala Gly Gly Glu Ile Gly His Gln Arg Pro Ser
1205 1210 1215

Val Ile Pro Arg Gly Asn Pro Asp Ala Asn Val Asp Thr Leu Ala Ala
1220 1225 1230

Phe Pro Pro Ser Cys Gln Ile Ser Ala Phe His Gln Leu Ala Glu Glu
1235 1240 1245

Leu Gly His Arg Pro Val Pro Val Ala Ala Val Leu Pro Pro Cys Pro
1250 1255 1260

Glu Leu Glu Gln Gly Leu Leu Tyr Leu Pro Gln Glu Leu Thr Thr Cys
1265 1270 1275 1280

Asp Ser Val Val Thr Phe Glu Leu Thr Asp Ile Val His Cys Arg Met
1285 1290 1295

Ala Ala Pro Ser Gln Arg Lys Ala Val Leu Ser Thr Leu Val Gly Arg
1300 1305 1310

Tyr Gly Gly Arg Thr Lys Leu Tyr Asn Ala Ser His Ser Asp Val Arg
1315 1320 1325

Asp Ser Leu Ala Arg Phe Ile Pro Ala Ile Gly Pro Val Gln Val Thr
1330 1335 1340

Thr Cys Glu Leu Tyr Glu Leu Val Glu Ala Met Val Glu Lys Gly Gln
1345 1350 1355 1360

Asp Gly Ser Ala Val Leu Glu Leu Asp Leu Cys Asn Arg Asp Val Ser
1365 1370 1375

Arg Ile Thr Phe Phe Gln Lys Asp Cys Asn Lys Phe Thr Thr Gly Glu
1380 1385 1390

Thr Ile Ala His Gly Lys Val Gly Gln Gly Ile Ser Ala Trp Ser Lys
1395 1400 1405

Thr Phe Cys Ala Leu Phe Gly Pro Trp Phe Arg Ala Ile Glu Lys Ala
1410 1415 1420

Ile Leu Ala Leu Leu Pro Gln Gly Val Phe Tyr Gly Asp Ala Phe Asp
1425 1430 1435 1440

Asp Thr Val Phe Ser Ala Ala Val Ala Ala Ala Lys Ala Ser Met Val
1445 1450 1455

Phe Glu Asn Asp Phe Ser Glu Phe Asp Ser Thr Gln Asn Asn Phe Ser
1460 1465 1470

Leu Gly Leu Glu Cys Ala Ile Met Glu Glu Cys Gly Met Pro Gln Trp
1475 1480 1485

Leu Ile Arg Leu Tyr His Leu Ile Arg Ser Ala Trp Ile Leu Gln Ala
1490 1495 1500

Pro Lys Glu Ser Leu Arg Gly Phe Trp Lys Lys His Ser Gly Glu Pro
1505 1510 1515 1520

Gly Thr Leu Leu Trp Asn Thr Val Trp Asn Met Ala Val Ile Thr His
1525 1530 1535

Cys Tyr Asp Phe Arg Asp Phe Gln Val Ala Ala Phe Lys Gly Asp Asp
1540 1545 1550

Ser Ile Val Leu Cys Ser Glu Tyr Arg Gln Ser Pro Gly Ala Ala Val
1555 1560 1565



Leu Ile Ala Gly Cys Gly Leu Lys Leu Lys Val Asp Phe Arg Pro Ile
1570 1575 1580

Gly Leu Tyr Ala Gly Val Val Val Ala Pro Gly Leu Gly Ala Leu Pro
1585 1590 1595 1600

Asp Val Val Arg Phe Ala Gly Arg Leu Thr Glu Lys Asn Trp Gly Pro
1605 1610 1615

Gly Pro Glu Arg Ala Glu Gln Leu Arg Leu Ala Val Ser Asp Phe Leu
1620 1625 1630

Arg Lys Leu Thr Asn Val Ala Gln Met Cys Val Asp Val Val Ser Arg
1635 1640 1645

Val Tyr Gly Val Ser Pro Gly Leu Val His Asn Leu Ile Gly Met Leu
1650 1655 1660

Gln Ala Val Ala Asp Gly Lys Ala His Phe Thr Glu Ser Val Lys Pro
1665 1670 1675 1680

Val Leu Asp Leu Thr Asn Ser Ile Leu Cys Arg Val Glu
1685 1690

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 660 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8 : 

Met Arg Pro Arg Pro Ile Leu Leu Leu Leu Leu Met Phe Leu Pro Met
1 5 10 15

Leu Pro Ala Pro Pro Pro Gly Gln Pro Ser Gly Arg Arg Arg Gly Arg
20 25 30

Arg Ser Gly Gly Ser Gly Gly Gly Phe Trp Gly Asp Arg Val Asp Ser
35 40 45

Gln Pro Phe Ala Ile Pro Tyr Ile His Pro Thr Asn Pro Phe Ala Pro
50 55 60

Asp Val Thr Ala Ala Ala Gly Ala Gly Pro Arg Val Arg Gln Pro Ala
65 70 75 80

Arg Pro Leu Gly Ser Ala Trp Arg Asp Gln Ala Gln Arg Pro Ala Val
85 90 95

Ala Ser Arg Arg Arg Pro Thr Thr Ala Gly Ala Ala Pro Leu Thr Ala
100 105 110

Val Ala Pro Ala His Asp Thr Pro Pro Val Pro Asp Val Asp Ser Arg
115 120 125

Gly Ala Ile Leu Arg Arg Gln Tyr Asn Leu Ser Thr Ser Pro Leu Thr
130 135 140






Thr Glu Ala Ser Asn Tyr Ala Gln Tyr Arg Val Ala Arg Ala Thr Ile
180 185 190

Arg Tyr Arg Pro Leu Val Pro Asn Ala Val Gly Gly Tyr Ala Ile Ser
195 200 205

Ile Ser Phe Trp Pro Gln Thr Thr Thr Thr Pro Thr Ser Val Asp Met
210 215 220

Asn Ser Ile Thr Ser Thr Asp Val Arg Ile Leu Val Gln Pro Gly Ile
225 230 235 240

Ala Ser Glu Leu Val Ile Pro Ser Glu Arg Leu His Tyr Arg Asn Gln
245 250 255

Gly Trp Arg Ser Val Glu Thr Ser Gly Val Ala Glu Glu Glu Ala Thr
260 265 270

Ser Gly Leu Val Met Leu Cys Ile His Gly Ser Leu Val Asn Ser Tyr
275 280 285

Thr Asn Thr Pro Tyr Thr Gly Ala Leu Gly Leu Leu Asp Phe Ala Leu
290 295 300

Glu Leu Glu Phe Arg Asn Leu Thr Pro Gly Asn Thr Asn Thr Arg Val
305 310 315 320

Ser Arg Tyr Ser Ser Thr Ala Arg His Arg Leu Arg Arg Gly Ala Asp 
325 330 335 

Gly Thr Ala Glu Leu Thr Thr Thr Ala Ala Thr Arg Phe Met Lys Asp 
340 345 350 

Leu Tyr Phe Thr Ser Thr Asn Gly Val Gly Glu Ile Gly Arg Gly Ile
355 360 365 

Ala Leu Thr Leu Phe Asn Leu Ala Asp Thr Leu Leu Gly Gly Leu Pro 
370 375 380 

Thr Glu Leu Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
385 390 395 400 

Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val 
405 410 415 

Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
420 425 430

Leu Gly Glu Ser Arg Val Val Ile Gln Asp Tyr Asp Asn Gln His Glu
435 440 445

Gln Asp Arg Pro Thr Pro Ser Pro Ala Pro Ser Arg Pro Phe Ser Val
450 455 460

Ser Val Thr Leu Val Asn Val Ala Thr Gly Ala Gln Ala Val Ala Arg
500 505 510

Ser Leu Asp Trp Thr Lys Val Thr Leu Asp Gly Arg Pro Leu Ser Thr
515 520 525

Ile Gln Gln Tyr Ser Lys Thr Phe Phe Val Leu Pro Leu Arg Gly Lys
530 535 540

Leu Ser Phe Trp Glu Ala Gly Thr Thr Lys Ala Gly Tyr Pro Tyr Asn
545 550 555 560

Tyr Asn Thr Thr Ala Ser Asp Gln Leu Leu Val Glu Asn Ala Ala Gly
565 570 575

His Arg Val Ala Ile Ser Thr Tyr Thr Thr Ser Leu Gly Ala Gly Pro
580 585 590

Val Ser Ile Ser Ala Val Ala Val Leu Ala Pro His Ser Ala Leu Ala
595 600 605

Leu Leu Glu Asp Thr Leu Asp Tyr Pro Ala Arg Ala His Thr Phe Asp 
610 615 620 

Asp Phe Cys Pro Glu Cys Arg Pro Leu Gly Leu Gln Gly Cys Ala Phe
625 630 635 640

Gln Ser Thr Val Ala Glu Leu Gln Arg Leu Lys Met Lys Val Gly Lys
645 650 655 

Thr Arg Glu Leu 
660 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 123 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 9 : 

Met Asn Asn Met Ser Phe Ala Ala Pro Met Gly Ser Arg Pro Cys Ala
1 5 10 15

Leu Gly Leu Phe Cys Cys Cys Ser Ser Cys Phe Cys Leu Cys Cys Pro
20 25 30

Arg His Arg Pro Val Ser Arg Leu Ala Ala Val Val Gly Gly Ala Ala
35 40 45

Ala Val Pro Ala Val Val Ser Gly Val Thr Gly Leu Ile Leu Ser Pro
50 55 60

ie: — - Phe Ile Gln Pro Thr Pro Ser Pro Pro Met Ser
65 70 75 80

Pro Leu Arg Pro Gly Leu Asp Leu Val Phe Ala Asn Pro Pro Aic His
85 90 95

Ser Ala Pro Leu Gly Val Thr Arg Pro Ser Ala Pro Pro Leu Pro His
100 105 110

Val Val Asp Leu Pro Gln Leu Gly Pro Arg Arg
115 120 

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7171 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: Composite Mexico strain

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GCCATGGAGG CCCACCAGTT CATTAAGGCT CCTGGCATCA CTACTGCTAT TGACCAAGCA 60 

GCTCTAGCAG CGGSCAACTC CGCCCTTGCG AATGCTGTGG TGGTCCGGCC TTTCCTTTCC 120 

CATCAGCAGG TTGAGATCCT TATAAATCTC ATGCAACCTC GGCAGCTGGT GTTTCGTCCT 180

GAGGTTTTTT GGAATCACCC GATTCAACGT GTTATACATA ATGAGCTTGA GCAGTATTGC 240

CGTGCTCGCT CGGGTCGCTG CCTTGAGATT GGAGCCCACC CACGCTCCAT TAATGATAAT 300 

CCTAATGTCC TCCATCGCTG CTTTCTCCAC CCCGTCGGCC GGGATGTTCA GCGCTGGTAC 360 

ACAGCCCCGA CTAGGGGACC TGCGGCGAAC TGTCGCCGCT CGGCACTTCG TGGTCTGCCA 420

CCAGCCGACC GCACTTACTG TTTTGATGGC TTTGCCGGCT GCCGTTTTGC CGCCGAGACT 480

GGTGTGGCTC TCTATTCTCT CCATGACTTS CAGCCGGCTG ATGTTGCCGA GGCGATGGCT 540 

S3CCAC 3GCA TGACCC 3CCT TTATGCAGCT TTCCACTTGC CTCCAGAGGT GCTCCTGC2T 600 

CCTGGCACCT AC 3GGACATC AT 3 3TACTTG CTGATCCACG ATGGTAAGCG CGCGGTTGT3 6 60 

ACTTATGA3G GTGACACTAG CGCCGGTTAC AATCATGAT3 TTGCSACCCT CCGCACAT3G 72 0 

ATCAGGACAA CTAAGGTTGT GGGTGAACAC CCTTTGGTGA TCGAGCGGGT GCGGGGTATT 730 

GTCACT TTGT3TTGTT GATCACT 3CG GCCCCTGAGC CCTCCCCGAT GCCCTACGTT 84 0 

lOCGC GTTCGACGGA GGTCTATGTC CGGTCTATCT TTGGGCCCGG CGGGTCOZCG 90 0 



jv i -.^^ OoMv^^o 1 ^ i G ^ .t^TCj T JAA'" T7CA277773 A3GCCG7CG7 CAC3 3ACA77 Qh,^ 

. 3;GGagcg77 7 cat octctt t:-G3g:tac3 3tcgac3acc aggccttttg gtg:tggagg 1020 

3AA.GGCTGGA AT3C3ACCGA 3GA7G3GC7G ACTGCA^TTA T7A 3GGC3GG TTAGGTGACA 11 40 

A7ATGTCATC AGCGTTATTT 33GGA333AG GGGATTT 3TA AGGGGATGGG GGGGGTTGAG 12 r 0 

GTTGAACATG GTGAGAAATT TATTT 3ACGC CTC7ACAGCT G 3CTAT7TGA GAA3TCAGG7 1260 

3G7GA77A3A T3CCA3GC3G 33AGCT37AG 77C7A3G77C AGTGGCGGCG 77GGT7ATC7 1320 

03cggg77 3c atct-: iaccc :ggcag:tta g777?tgatg agtcagtgoc ttgtagctgc 12 ~o 

73AACCAC3A TCCG30 3GAT 3 3CTGGAAAA TTTTG:TGTT 77A7 3AAG73 GCTCG 37CA.3 14 40 

jAgtgttctt gttt:ctgca C3cc3c;gag gggct ^gcgg gcgatcaagg tca7gacaat 12:0 

GAGGCCTA7G AAGGCT 3TGA TGTTGATACT GC7GAGCCTG CCAGGCTAGA CA7TAGAGGG 1560 

T C AT AC AT CG 7GGA7G 37CG G7 37 3TGCAA ACTGTCTATC AAGC7C7 3GA CC7GC7A3 37 1620 

GACCTGGTAG CT3GCGTAGC CCGA3TGTCT GCTACAGTTA CTGT7A7TGA AACC7CTGGG 1 58 D 

CGTCTGGATT GCCAAACAAT GATCGGCAAT AAGAC7TTTC TCACTACCTT TGT7GATGGG 174 0 

GCACGCCTTG AGGT7AACGG GCGTGAGCAG CTTAACCTCT CTTTTGACAG CCAGCAGTGT 18 00 

AGTATGGCAG CCGGCCCGTT TTGCCTCACC TATGC7GCCG TAGATGGCGG GCTGGAAGTT 18 60 

7ATTTTTCCA CCGC7GGCCT C 3AGAGCCGT GTTGTTTTCC CCCC7GGTAA TGC 73 CG ACT 1920 

GCCCCGCCGA GTGAGG7CAC CGCCTTCTGC TCAGCTCTTT ATAGGCACAA CCGGCAGAGC 198 0 

CAGCGCCAGT CGGTTA7TGG 7AGTTTG7GG CTGCACCCTG AAGGTTTGCT CGGCCTGTTC 2040 

CCGCCCTTTT CACCCGGGCA TGAGTGGCGG TCTGCTAACC CATTTTGCGG CGAGAGCACG 2100 

CTCTACACCC GCACTTGGTC CACAATTACA GACACACCCT TAACTGTCGG GCTAATTTCC 2160 

GGTCATTTGG ATGCTGCTCC CCACTCGGGG GGGCCACCTG CTACTGCCAC AGGCCCTGCT 2220 

GTAGGCTCGT CTGACTCTCC AGACCCTGAC CCGCTACCTG ATGTTACAGA TGGCTCACGC 228 0 

CCCTCTGGGG CCCGTCCGGC TGGCCCCAAC CCGAATGGCG TTCCGCAGCG CCGCTTACTA 2 34 0 

CACACCTACC CTGACGGCGC TAAGATCTAT GTC3GCTCCA TTTT 3GAGTC TGAGTGCACC 2 4 00 

T 3G2TTGT OA ACGCAT3TAA CGCCGGC7AC CGGZ2TGG7G GC3GGCTTT3 T7A7 GCTTTT 2 4 bo 

TTTCAGCGT7 ACCCT3ATT: GTTTGACGCG ACCAAGTTTG TGAT3C3TGA TGGT3TTGCC 2520 

G3GTATAC3C TTACA-2 27 3 3 GC3GAT7ATT 3A73CG3T3G CC3333AC7A T 2GATTGG.AA 2580 

CATAACCC3A AGAGGGT3GA GG3TGCCTAC CGCGAGACTT GC37C7GC7G AGGCACTGCT 2 64 0 

G33TATC3A3 TCTTAG323C TGGCATTTAG CAGG7G377G TTAG7T7GAG 77773A7GCC 27Cu 

T 3GGAGCGGA AC3AC3 37C3 G77TGA3GAG C777AC3TAA CAGAGC7GGC G3CTCGG7GG 27 6 0 
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2 820 


.j^r^v^.^o L.-.TGj^.T 7G-^7.GGG2 ' ^-^ _ o ^ A. ^ _ 7G 3 3 3GG7 37 2880

rLrv->.._: . _ o.-.o'- ^ * j^v, j . i j . ^ j _ ~. . w A^ o ^ j : „ w _ _ o G 2940

i ^. m AAv:T„^vj Tj^AACAG ; jC GGA7G7 3GA7 GT . jTT J 1 7^ 7GGGGAG7 2G jA'j 37TCGG 3000

An^ jCTTo'jC ^ 3GGCG GGGG GTTTGCGGCA T7 3AG70 3GG A3AG7 3GGGC 3GG3 37 3AGT 3060

AGG3GC33TA G3G7TG7GAT 7GA7 3A3GGG c:tt:gct:c g:c:a:a3T7 GC7GG7777A 3120

GA7A7GGAGG G7G3TGGA7G 737G3AG37G — rp r-r, ■-. — G jAAT GAGA7 GGGG 3 3 GATA 3180

GA77773A3C A3AG3GG7G7 GATTMAGGA A TAG ;iG- G" 3G A j77 3G7G 7G 'jA 1 ^ l 7 ; -,A7>jG 3240

73GGA7 37 3A 3 3GAGG377G G3G7 3GAGA7 G7G7 7:GA;7 rri , 7GG7'7A3 3 37 3300

AAAATCCAGA G7A.CAA.GTAA G37 3 37GG37 7GGG7777 ;T GGG3AGAGGC AGOTGTG 3GG 3360

gagaag:tag T377 0agaca ggctgctaag GGG ZiGGCkZZ 3GGGA7GTAT AACGG7GGA7 3420

gaggcc:agg gtgctacttt tacgactaga AGTATAA7TG GAA37GGAGA ^ VJk..U ... O „ 1 Jljl^ 3480

ctgata:agt cgtcccgggg t:agggta:a GTT 3G7G7GA G7A 3GGA7AC 7 G AAj-JAn T G T 3540

gttatacttg agtctcgcgg cctgttgcgt GAGG 7 GGG 7 A 7C7CAGATGC CA77G77AAT 3600

AATTTCTTCC 77TCGGGTGG CGA3G7TGG7 GAOGAGAGAC CATCGGTCAT 7CGGCGAGGC 3660

AACCCTGACC GCAATGTTGA CGTGCTTGCG GGG7TTCGAC CTTCATGCCA AATAAGCGCC 3720

TTGCATCAGC TTGG7GAGGA GCTGGGCCAC CGGGGGGGGG CGG7GGCGGC TG7GGTAGGT 3780

CCGTGCCCTG AGCTTGAGCA GGGCCTTCTC 7A7G7GGCAG AGGAGCTAGC C7GCTG7GAC 3840

AGTGTTGTGA CATTTGAGCT AAC7GACATT GTGCACTGCC GGA7GGCGGC CCCTAGCCAA 3900

AGGAAAGCTG TTTTGTCCAC GCTGGTAGGC CGGTATGGCA GACGCACAAG GCTTTATGAT 3960

GCGGGTCACA GCGATGTCCG CGCCTCCCTT GCGGGGT7TA TTCCGAG7GT GGGGGGGG7T 4020

ACTGCCAGGA CCTGTGAACT CTTTGAGCTT GTAGAGGGGA TGG7GGAGAA GGGCCAAGAC 4080

3GTTCAGCCG TCCTCGAGTT GGATTTGTGC AGCGGAGA7G TCTCCCGCAT AACCTTTTTC 4140

CAGAAGGATT GTAAGAAGTT CACGAGGGGC GAGAGAA77G CGGA7GGGAA AGTC3GTCAG 4200

GGTATGTTGG GC T GG AG TAA GACGTTTTGT GGG 37 3777G GGGG07GG77 COGT 3 3GA77 4260

GAGAAGGGTA T7 37A7 3GG7 T77AGGAGAA GG7G7 377G7 A_ jG-joA7 7 7A7 3AG 3AG 4320

7GAG7AT7GT CT3GTGG 3GT GGCTGGZGGG AGGGA7G 3GA TGGTGTTTGA AAATGATTTT 4380

TGTGAGTTTG AGTGGAGTGA GAATAA3TTT 7 3GG7AGG7 3 77 3A37 3G3G ^ A * i ."li j'j.Art 4440

GAGTGTGGTA TGZGZ GAGTG GGTTGT GAGG 77G7AGGA7 3 jG i j7 3GA7G 4500

ctggagggg: caaaagagtc tttgagaggg 77G7GGAAGA A3GA77G7GG 7GAGG0GGGG 4560

AG^TTGGTCT GGAATAC 3GT G7 3GAAGA7 3 GGAA7GA77G GGGA77GG7A 7GAG77GGGG 4620

j * --j i. T-G.-.T T-_ jCGGGACG G7777GGGAG AAGAACT jGG GGG37 3A7 02 GGAG 3GGG 3A 4860

3ag:agct:: gc:?ggccg? ggaggatttc ctcggtaggt taacgaat^t ggc:gagatt 4920

3g7 377 3ag3 7 3g7g7c7ag ag777ac ogg gtttccccgg gtctggtt oa taacctgata 4980

.TOCATOOTOO AGAOTATTGG 7GA7 0G7AAG G3GCA777TA CAGAGTCTGT 7AAGG27A7A 5040

:ttga::tta gaoagtcaat ta7^:acogg tg?gaa7:aa taa-:a7Gt :g 77t:g?gcgc 5100

7CA7GG3:70 GOOAGGATGG GGC 07AGGC 0 7C7777GC7G 77337337 :7 7G7T737G33 5160

7A7G7T0:CG 3 0 3G0AGG3A CCGG7CAGG 0 37373G33GC GG03G7GG73 GGG XAGC GO 5220

CGG7AC Z 3GC G07GG7T7 37 GGGG7 3AGCG GG77GA7T0T GAGCGCTTGG CAATCGGC7A 5280

7A77CA7ICA A0:AA00CC7 77GC0CCAGA GGTTGCCGGT GCGTCCGG 3T CTGGACGT OG 5340

7G77GGCGAA G OAGC 0 0 GGC CA37 0GGC72 CAG77GGCGA GATCAGGCCG AGCGG 33CTC 5400

CGC7 3C27CC CG7CGCCGAC CTGCCACAGC CGGGGCTGCG GCGC7GACGG CTG7GGCGCC 5460

73CC3A7GAC AC:TCACCCG TCGCGGACGT TGATTCTCGC GGTGCAATTC TACGCCGCCA 5520

G7A7AA777 3 TC7ACTTCAC CCCTGACATC CTCTGTGGCC 7CTGGCAC7A ATTTAGTCCT 5580

GTATGCAGCC CCCCTTAATC CGCCTCTGCC GCTGCAGGAC GG7AC7AA7A CTCACATTAT 5640

GGCCACAGAG GCCTCCAATT ATGCAGAGTA CCGGGTTGCC CGCGCTAC7A TCCGTTACCG 5700

GCCGCTAGTG CCTAATGCAG TTGGAGGCTA TGCTATATCC A777CTT7CT GGCCTCAAAC 5760

AACCACAACC CCTACATCTG T7GACATGAA TTCCATTACT 7CCACTGA73 TCAGGATTCT 5820

TGTTGAACGT GGCATAGCAT C7GAATTGG7 CATGGCAAGC GAGCGCCTTG ACTACCGCAA 5880

TCAAGGTTGG CGCTCGGTTG AGACATCTGG TGT7GCTGAG GAGGAAGCCA CCTCCGGTCT 5940

TGTGATGTTA TGCATACATG GCTCTCCAGT TAACTCCTAT ACCAATACCC CTTA7ACCGG 6000

TGCCCTTGGC TTACTGGACT TTGCCTTAGA GCTTGAGT7T CGCAATCTCA CCAC 3TGTAA 6060

CACGAATACA CGTGTGTCCC GTTAGTCGAG 3AC7GC7 3 37 CACTCCGGGG GAGGGGCCGA 6120

GGGGAGTGGG GAGCTGAGGA CAACT GGAGG CACCAGG77 3 ATGAAAGATC TC OAGTTTAC 6180

CGGCGITAAT GGGGTAGG7G AAG7CGGC3G CGGGATAGCT CTAACATTAO TTAAC7TT3G 6240

TGAGAGGC7 3 CT ZGGGGGGC TG3CGACAGA ATTAATTTCG 733G37GG33 GGGAAG7GT7 6300

77A77CCC3-3 CCGGTTGTCT CAGCCAA73G CGAGGCAAGG GTGAAGCTCT ATACATCAGT 6360

G3AGAAT3G7 CAG3A3GATA AG3G7GT73C 7A7CCGC0AG GA7A73 3A7C 773G7GA77 3 6420

GGG7GTG 073 A77 3AGGA77 A73ACAAC OA GGA7GAGCAG GATGGGGGGA CCC2 37CG2 3 6480




:agt 



;7C 3AAA 



■ j i _ Aw_Cj.CjACG 3GC33CC3CT 



AGCAAATGA7 GTAC777GGC 7G7CCC7CAC 

*.« j j oTOGTCAACT 0GCCGGGT77 A7A7GTCGGA 

.7G'j GGCGGAGGGG 37A30GC3A7 3GC7T3AC7G 

CGAG7G77 GAGCAATATT CCAAGACATT 



OTirSTSCTC CCCC77CGTG 
:7AT:C77A7 aattataata 



GCAA3CT7TC CTTTTGGGAG GCC 3GCACAA GAAAAGGAGG 
7TA3T3CTAG 7 3 AC ZAGAT T OTGATTGAAA ATGCTGCCGG 



:GAr:G3G7G GCOATTTCAA GOTATA^OAC GAG 



-.-i^-j^ i i oo'o 'jL ^GGT 0 7GG 7 2 37 OATTTC 
3GCT0TG OT G1AGGATA CTTTTZATTA 



70o:a:a:a7 ttiat 

17 J ZAjTCAA 07GT 2 
TTGTAGTTTA 777GG 



GG7 OAA 



AAC7CGGGAG 
AT77CC7T77 7C7CGGTCCC GCGCT 



k ■ J A O 



^o i „ J-..77 7AbGG37GCA 
7TAAAG77A A3G7GGG7AA 
:CCACCTACT 7A7A7C7GCT GA7T7CG77T 



CCTG A 



6540
6600
6660
6720
6780

6840

6900
6960
7020
7080
7140
7171



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1575 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: T: Mexican strain 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GTTGGGTGAG GTGGGTATCT CAGATGCCAT TGTTAATAAT TTCTTCCTTT CGGGTGGCGA 60
3GTTGGTCA0 CAGAGACCAT CGGTZATTCC 3CGA3GCAAC CCTGACCGCA ATGTT 3ACGT 120
GOTTOOGGCO TTTOOACOTT CAOGOOAAAT AA3C23077C CA7 0AGG77G CT 3A33AGC7 180
GGGG:A733G CC33CGC03G 73G03GG737 GG7AGG7C:C 73C0:7GAGC 77GA3CAGGG 240
CCTTCTCTAT C7G7CACAGG AGC7AG7 07C C7G7 3A3AG7 G7737GACA7 77 3A3C7AAC 300
7GA0A77 37G CAC7 3CCGCA 7GG:GGG0:G 7AGC0AAAGG AAAGCTGTTT 7GrC7A0GCT 360
GGTAGGC 2 OG TAT 3 3 GAG AC G 3ACAAGGCT TTATGA7GCG GG7CACACC3 ATG7CC3CGC

C7CGC77GCG CGCTTTATTC C0AC7C7 3GG G7 7GGTTACT
7GAGC77G7A GAGGCGA7GG T G G A 3 AA G G G C ■: 2 AA G A CGG7

aCCACCT GTGAACTCTT

7CGAG77GGA
1 T:1"



i T GC GAT T GAG AAGGC7A7TC TATC CCTTTT 720

ACGACAAGGT GTGTTCTACG GGGATGCTTA TGAC GAGTCA GTATTCTCTG CTGCCGT3GC 780

TGGCGCGAGC CATGCCATGG TGTTTGAAAA TGATTTTTCT GAGTTTGAC7 CGACTCAGAA 840

TAACTTTTCC CTAGGTC77G AGTGCGCCAT TATGGAAGAG TGTGGTATGC CCCAGTGGCT 900

TGTCAGGTTG TACCA7 0CCG TGZGGTCGGG GTGGACCOTG GAGGCCGCAA AAGAGTGTTT 960

GAGAGG GTTC TGGAAGAA30 AT? 0TGGT3A GCCGGGCACG TTGCTCTGGA ATACGGT

GAACATGGCA AT GAG T ZZZZ ATT3CTAT3A GTTCCGGOAC CTCCAGGTTG CGGTCTTGAA 1080

uoLrCJACGAC TGGGTC-OT:: TCTGTAGT0A ATA GCGCCAG AGCCCAGGCG C033TTC3CT 1140

TATA3CAGGC TGTGGTTTGA AGTTGAAGGC TGACTT 0CGG CCGATTGGGC T3TAT3C0GG 1200

GGTT 0TG 3TC ZGZZZGGGGG T 0GGG3C OCT AGGGGATGTG GTTGGATTGG CC3GACGGCT 1260

TTCGGAOAAG AAOTGGGGGC CTGATGGGGA GGGGGCAGAG CAGCTCCGCC TCGCCGTGCA 1320

GGATTTOCTC C3TAGGT7AA CGAATGTGGC GGAGATTTGT GTTGAGGTGG TGTCTAGAGT 1380

TTACGGGGTT TCCCCGGGTC TGGTTGATAA CCTGATAGGC ATGGTGCAGA CTATTGGTGA 1440

TGGTAAGGCG CATTTTACAG AGTCTGTTAA GO C TAT ACT? GACGTTACAG ACTCAATTAT 1500

GGACGGGTCT GAATGAATAA CATGTGGTTT GCTGCGCCCA TGGGTTCGGG ACCATGCGCC 1560

CTAGGCCTCT TTTGC 1575

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 874 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: Tashkent strain

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CGGGG OC0 3T ACAG3TCACA ACGTGTGAGT T3IACGAGC7 AGTGGAGGGC AT3GTCGAGA 60

AAGGCCA3GA TGGCTCZZCC G7CCTT 3AGC TTGAICTCTG CAACCGTGAG GTGTGGAGGA 120

TCAGCTTTTT CCAGAAAGAT T GC AATAAGT TGAC ZAGGGG AGAGACCATC GGGCATGGTA 180

^.o.^C^A --oG-ATTTCG GCG7GGAG7A AGAGC7TCTG TGGGGTTTTC GGCCGCTGGT 240

TCCG7GC7A7 TGA0AAGGC7 A7T0TGGCCG 7GC7C0G70A GGGTGTGTTT TATGGGGATG 300

CG777 GA7GA GAG OGTGTTG 7CG 3C3CG7G TGGCGGCAGO AAAGGCGTCG ATGGTGTTTG 360

AGAATGACTT 77G7 3AGTTT GAGTG ZAGCG AGAAIAATTT TTCCCTGGGC CTAGAGTGTG 420

CTATTATGGA GAAGT 3T3GG ATGCC 3AAGT GG OTCATCOG CTT3TACCAC GTTATAAGGT 480

CT0CGTGGAI ZCZZGAGGGZ CG 3AAGG AG T GGGTGCGAGG GTGTTGGAAG AAAGACTCCG 540

GTGA.3CCC 3G GAGTG TT OTA TG 3AATACTG TCTG jAACAT GGCOGTTATC ACCGATT3TT 600

AGGA777G:G GGA77700AG GTGG07GCC7 7TAAAGGTGA T0A77GGATA G7 3CTTT OCA 660

AGOG TCAGAGI ■ ^ r Jul 7 1 jT ; j'jC 77AAAGC7 3A 720

jU * ,J1Ji ' JJ ' jT " ^GGTOC^ATT GG777 37A7G GAGG7G77G7 GG7GAGGGGG GGGCTTGGGG 780

GGCTTGG 0GA CGTC3TGCGC 77G7GGGGGG GGG77AGTGA GAAGAAT7GG GGCCCTGGCC 840

CTGAGCGGGC GGAGGAGGTC CGCCT7GCTG 7GGG 874

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 449 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: Clone 406.4-2 cDNA 

(ix) FEATURE: 

(A) NAME /KEY : CDS 

(B) LOCATION: 2..100

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

C GCC AAC CAG CC0 GGG CAC TTG GCT CCA GTT GGC GAG ATG AGG CCC 46
Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro
1 5 10 15

AGO GCG COT GCG CT 3 CGT CG0 0TC GCC GAG GTG CGA CAG CCG GGG CTG 94
Ser Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu
20 25 30

COG GGG T j AC 3GCTGT GGCGGZTGCZ CATGAGACGT CAC DC 0TGGG GGACGTTGAT 150
Arg Arg



TCTCGCGGTG CAAT7GTAGG COGCCAGTAT AATTTGTGTA CTTCACGCGT GAGATCCTGT

GTG 3CCTCTG GCACTAA.TT AGTCCTGTAT GGAGCGGGGG TTAATCCGCC - - i -J - O ^> i. o 270

GTAATAGTGA ■~ AT T A T G G G G AG AG A G G G G T cgaattatg: A.GAGTAGGGG 330

G G AG T A. T G G G TTAGGGGGGG CTAGTGCGTA r\ ~ o^AjoTTGG A oG ^ T.-aT 'j C T 390

ATATCCATTT l - i i 1 ^ i ou^ ^ TCAAACAACC ACAACCCCTA CATGTGTTGA CATGAATTC 449



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro Ser
1 5 10 15

Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu Arg
20 25 30

Arg



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 130 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: Clone 406.3-2 

(ix) FEATURE:

(A) NAME /KEY : CDS

(B) LOCATION: 5..130

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GGAT ACT TTT GAT TAT CCG GGG CGG GGG CAC ACA TTT GAT GAC TTC TGC 49
Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys
1 5 10 15

CGT GAA TGC GGG GOT TTA GGG CTG CAG GGT TGT GCT TTC CAG TCA ACT 97
Pro Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr
20 25 30

GTG GGT GAG CTG GAG GGG CTT AAA GTT AAG GTT 130




(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys Pro
1 5 10 15

Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val
20 25 30

Ala Glu Leu Gln Arg Leu Lys Val Lys Val
35 40 
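
The CDS features above (LOCATION 2..100 in SEQ ID NO: 13, LOCATION 5..130 in SEQ ID NO: 15) relate to the peptide lengths in SEQ ID NO: 14 and SEQ ID NO: 16 by ordinary codon arithmetic. The following is a minimal sketch of that check, not part of the original listing. It assumes the standard genetic code, the function names `cds_codon_count` and `translate` are hypothetical, and the codon table is deliberately limited to the first seven codons of SEQ ID NO: 15, which are legible in this copy.

```python
# Illustrative sketch only; names are hypothetical and not from the listing.
# Subset of the standard genetic code covering the codons used below.
CODON_TO_AA = {
    "ACT": "Thr", "TTT": "Phe", "GAT": "Asp", "TAT": "Tyr",
    "CCG": "Pro", "GGG": "Gly", "CGG": "Arg",
}


def cds_codon_count(start, end):
    """Return the number of codons in a CDS with 1-based inclusive bounds."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length %d is not a multiple of 3" % length)
    return length // 3


def translate(nucleotides):
    """Translate a codon string (whitespace ignored) to three-letter codes."""
    seq = "".join(nucleotides.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3)]


# CDS 2..100 spans 99 nucleotides (SEQ ID NO: 13);
# CDS 5..130 spans 126 nucleotides (SEQ ID NO: 15).
codons_seq13 = cds_codon_count(2, 100)
codons_seq15 = cds_codon_count(5, 130)

# First seven codons of SEQ ID NO: 15, starting at position 5.
first_residues = translate("ACT TTT GAT TAT CCG GGG CGG")
print(codons_seq13, codons_seq15, " ".join(first_residues))
```

The two codon counts agree with the stated lengths of SEQ ID NO: 14 (33 amino acids) and SEQ ID NO: 16 (42 amino acids), and the translated residues agree with the start of SEQ ID NO: 16.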

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 406.4-2 epitope - Mexican strain

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro Ser
1 5 10 15

Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu Arg
20 25 30

Arg

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 406.4-2 epitope - Burma strain

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Ala Asn Pro Pro Asp His Ser Ala Pro Leu Gly Val Thr Arg Pro Ser
1 5 10 15

Ala Pro Pro Leu Pro His Val Val Asp Leu Pro Gln Leu Gly Pro Arg
20 25 30

Arg

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 406.3-2 epitope - Mexican strain

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys Pro
1 5 10 15

Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val
20 25 30

Ala Glu Leu Gln Arg Leu Lys Val Lys Val
35 40 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 406.3-2 epitope - Burma strain

Leu Asp Tyr Pro Ala Arg Ala His Thr Phe Asp Asp Phe Cys Pro
5 10 15

Glu Cys Arg Pro Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val
20 25

Ala Glu Leu Gln Arg Leu Lys Met Lys Val
35 40



